Semantic Image Synthesis



Supplementary Material for Semantic Image Synthesis with Unconditional Generator
JungWoo Chae

Neural Information Processing Systems

This process enables the value (feature maps) to be rearranged (through a weighted sum) to align with the form of the query, thereby reflecting their strong correspondence. The input noise is removed because its stochasticity slows down training. Given the need to balance high correspondence against image quality, we set the weights of our loss terms empirically. To demonstrate the influence of the additional losses introduced in our method, we provide quantitative and qualitative ablations in Figures S2 and S3, respectively. Nonetheless, caution is warranted when overly increasing the number of clusters.
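The "rearranging the value to align with the form of the query" step described above is standard cross-attention. The following is a minimal NumPy sketch of that weighted-sum mechanism, not the paper's actual implementation; all shapes and names here are illustrative assumptions.

```python
import numpy as np

def cross_attention(query, key, value):
    """Rearrange `value` rows via a weighted sum so the output takes the
    spatial form of `query`; the weights encode query-key correspondence."""
    d = query.shape[-1]
    scores = query @ key.T / np.sqrt(d)           # (Nq, Nv) correspondence
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ value                        # (Nq, Dv): value in query's form

# toy example: 4 query positions attend over 6 value positions
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
k = rng.normal(size=(6, 8))
v = rng.normal(size=(6, 16))
out = cross_attention(q, k, v)
print(out.shape)  # (4, 16)
```

Each output row is a convex combination of value rows, so the output inherits the query's layout while carrying the value's features.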




SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow

Neural Information Processing Systems

For image synthesis, we propose a finite perturbation approach to enhance the diversity of generated results without changing the semantic categories. Experiments show that our SemFlow achieves competitive results on semantic segmentation and semantic image synthesis tasks.
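The "finite perturbation" idea can be illustrated as adding a random latent offset of fixed, bounded magnitude, small enough that the decoded result stays in the same semantic category while varying in appearance. This is a hypothetical sketch of that principle, not SemFlow's actual code; the function name and scale are assumptions.

```python
import numpy as np

def finite_perturbation(latent, scale=0.1, rng=None):
    """Hypothetical sketch: add a random perturbation of fixed finite norm
    `scale` to a latent, so decoded images vary in appearance while the
    perturbation stays too small to change semantic categories."""
    rng = rng if rng is not None else np.random.default_rng()
    noise = rng.normal(size=latent.shape)
    noise *= scale / (np.linalg.norm(noise) + 1e-8)  # clamp to finite magnitude
    return latent + noise

z = np.zeros(16)
z_pert = finite_perturbation(z, scale=0.1, rng=np.random.default_rng(1))
print(np.linalg.norm(z_pert - z))  # ≈ 0.1
```

Fixing the perturbation norm (rather than sampling unbounded noise) is what makes the diversity controllable.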






Reviews: Learning to Predict Layout-to-image Conditional Convolutions for Semantic Image Synthesis

Neural Information Processing Systems

This paper proposes a strongly conditional network for generating images from semantic maps. How sensitive is this network to small changes in the input map? For example, given three sequential frames of a video (as segmentation maps), is the model consistent in assigning colors and structures, or do small changes in the geometry of the semantic objects have a large impact on the output? This is mostly curiosity, as smoothness inherent in the model has large potential for video applications. Some qualitative comparisons to other models were shown, but visualizing the important regions of the input conditioning, and the influence of input perturbations on the model output, could also yield valuable insight; something like Grad-CAM or related methods may be applicable for checking the importance of input features.
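The reviewer's perturbation probe could be run without gradients as a simple finite-difference check: flip single label pixels in the input map and measure how much the output changes. The sketch below is an illustration of that idea with a stand-in "generator", not an analysis of the paper's model; all names are hypothetical.

```python
import numpy as np

def output_sensitivity(model, seg_map, n_trials=8, rng=None):
    """Estimate how much a generator's output changes when single label
    pixels of the input segmentation map are toggled (finite differences).
    `model` is any callable mapping a label map to an image-like array."""
    rng = rng if rng is not None else np.random.default_rng(0)
    base = model(seg_map)
    deltas = []
    for _ in range(n_trials):
        perturbed = seg_map.copy()
        idx = np.unravel_index(rng.integers(0, seg_map.size), seg_map.shape)
        perturbed[idx] = (perturbed[idx] + 1) % 2  # toggle one binary label
        deltas.append(np.abs(model(perturbed) - base).mean())
    return float(np.mean(deltas))

# stand-in "generator": just casts the label map to float
def toy_model(m):
    return m.astype(float)

m = np.zeros((8, 8), dtype=int)
s = output_sensitivity(toy_model, m)
print(s)  # → 0.015625 (one flipped pixel out of 64)
```

A model that is smooth in the sense the reviewer wants would show small, stable sensitivity values across frames; large or erratic values would indicate the geometry-dependence they are asking about.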


Reviews: Learning to Predict Layout-to-image Conditional Convolutions for Semantic Image Synthesis

Neural Information Processing Systems

All reviewers are in unanimous agreement for acceptance. The paper has a number of interesting contributions, mostly empirical, in its use of a conditional-weights network and feature pyramids. Please release the code as promised in your rebuttal.